Making sense of collocations

Authors

  • Leo Wanner
  • Bernd Bohnet
  • Mark Giereth
Abstract

Lexico-semantic collocations (LSCs) are a prominent type of multiword expression. Over the last decade, a significant number of works have addressed the automatic compilation of LSCs from text corpora. Very often, however, the output of an LSC-extraction program is a plain list of LSCs. While useful as raw material for dictionary construction, plain lists of LSCs are of rather limited use in NLP applications. For NLP, LSCs must be assigned syntactic and, especially, semantic information. Our goal is to develop an "off-the-shelf" LSC-acquisition program that annotates each LSC identified in the corpus with its syntax and semantics. In this article, we address the annotation task as a classification task, viewing it as a machine learning problem. The LSC typology we use consists of the lexical functions from Explanatory Combinatorial Lexicology; EuroWordNet serves as the lexico-semantic resource. The machine learning technique applied is a variant of the nearest-neighbor family, defined over lexico-semantic features of the elements of LSCs. The technique has been tested on Spanish verb–noun bigrams. © 2005 Elsevier Ltd. All rights reserved.
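
The abstract does not spell out the classifier, so the following is only a rough sketch of what a nearest-neighbor classification over lexico-semantic features might look like, not the authors' method. Everything in it is an assumption made for illustration: the feature strings are invented stand-ins for EuroWordNet-derived information, the training items are toy data, Jaccard overlap is a swapped-in similarity measure, and `jaccard` and `classify` are hypothetical helper names. Only the lexical function labels `Oper1` and `Real1` are taken from the standard lexical function inventory.

```python
from collections import Counter

# Toy training set (invented for illustration). Each verb-noun bigram is
# represented by a set of made-up lexico-semantic features and is labelled
# with a lexical function name.
TRAINING = [
    (frozenset({"verb:give", "verb:transfer", "noun:lecture", "noun:communication"}), "Oper1"),
    (frozenset({"verb:give", "verb:transfer", "noun:speech", "noun:communication"}), "Oper1"),
    (frozenset({"verb:take", "verb:perform", "noun:walk", "noun:act"}), "Oper1"),
    (frozenset({"verb:keep", "verb:fulfil", "noun:promise", "noun:commitment"}), "Real1"),
    (frozenset({"verb:honour", "verb:fulfil", "noun:vow", "noun:commitment"}), "Real1"),
]


def jaccard(a, b):
    """Jaccard similarity of two feature sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(features, training=TRAINING, k=3):
    """Return the majority label among the k training items most similar to `features`."""
    ranked = sorted(training, key=lambda item: jaccard(features, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    candidate = {"verb:give", "verb:transfer", "noun:talk", "noun:communication"}
    print(classify(candidate))  # prints "Oper1" on this toy data
```

A set-based similarity is used here only because it keeps the sketch short; a real system would need features actually drawn from the lexical resource and a tie-breaking policy for the vote.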

Similar resources

Class-based collocations for Word Sense Disambiguation

This paper describes the NMSU-Pitt-UNCA word-sense disambiguation system participating in the Senseval-3 English lexical sample task. The focus of the work is on using semantic class-based collocations to augment traditional word-based collocations. Three separate sources of word relatedness are used for these collocations: 1) WordNet hypernym relations; 2) cluster-based word similarity classes...

Discourse Community Collocations and L2 Writing Content

Taking the position that writing can be an important skill to foster knowledge building pedagogy, this article explores vocabulary as a supportive tool for this purpose. Having this in mind, a compilation of conceptually loaded vocabularies pertaining to seven discourse communities was developed, two of which were given to a group of L2 writers to investigate the implications of phraseology for...

Discriminative Ability of WordNet Senses on the Task of Detecting Lexical Functions in Spanish Verb Noun Collocations

Collocations, or restricted lexical co-occurrence, are a difficult issue in natural language processing because their semantics cannot be derived from the semantics of their constituents. Therefore, such verb-noun combinations as “take a break,” “catch a bus,” “have lunch” can be interpreted incorrectly by automatic semantic analysis. Since collocations are combinations frequently used in texts...

Preposition Semantic Classification via Treebank and FrameNet

This paper reports on experiments in classifying the semantic role annotations assigned to prepositional phrases in both PENN TREEBANK (version II) and FRAMENET (version 0.75). In both cases, experiments are done to see how the prepositions can be classified given the dataset’s role inventory, using standard word-sense disambiguation features, such as the parts of speech of surrounding words, a...

Associating Collocations with WordNet Senses Using Hybrid Models

In this paper, we introduce a hybrid method to associate English collocations with sense class members chosen from WordNet. Our combinational approach includes a learning-based method, a paraphrase-based method and a sense frequency ranking method. At training time, a set of collocations with their tagged senses is prepared. We use the sentence information extracted from a large corpus and cros...

Publication date: 2005